Information Preserving Dimensionality Reduction
Authors
Abstract
Dimensionality reduction is a common preprocessing step in many machine learning tasks. The goal is to design data representations that, on the one hand, reduce the dimension of the data (thereby allowing faster processing) and, on the other hand, retain as much task-relevant information as possible. We look at generic dimensionality reduction approaches that do not rely on much task-specific prior knowledge. However, we focus on scenarios in which unlabeled samples are available and can be used to evaluate the usefulness of candidate data representations. We aim to provide theoretical principles that help explain the success of certain dimensionality reduction techniques in classification tasks and that guide the choice of dimensionality reduction tool and parameters. Our analysis is based on formalizing the often implicit assumption that “similar instances are likely to have similar labels”. Our theoretical analysis is supported by experimental results.
Similar resources
Semantic Preserving Data Reduction using Artificial Immune Systems
Artificial Immune Systems (AIS) can be defined as soft computing systems inspired by the immune system of vertebrates. The immune system is an adaptive pattern recognition system. AIS have been used in pattern recognition, machine learning, optimization and clustering. Feature reduction refers to the problem of selecting those input features that are most predictive of a given outcome; a problem encoun...
Neighborhood Preserving Projections (NPP): A Novel Linear Dimension Reduction Method
Dimension reduction is a crucial step for pattern recognition and information retrieval tasks to overcome the curse of dimensionality. In this paper, a novel unsupervised linear dimension reduction method, Neighborhood Preserving Projections (NPP), is proposed. In contrast to traditional linear dimension reduction methods, such as principal component analysis (PCA), the proposed method has good n...
Semi-supervised Sparsity Pairwise Constraint Preserving Projections based on GA
The main problem of existing semi-supervised dimensionality reduction methods with pairwise constraints is their limited ability to preserve the global geometric structure of the data. A dimensionality reduction algorithm called Semi-supervised Sparsity Pairwise Constraint Preserving Projections based on Genetic Algorithm (SSPCPPGA) is proposed. On the one hand, the algorithm fuses unsup...
Constraint-based sparsity preserving projections and its application on face recognition
To address the lack of supervision information in the sparse reconstruction process of Sparsity Preserving Projections (SPP), a semi-supervised dimensionality reduction method named Constraint-based Sparsity Preserving Projections (CSPP) is proposed. CSPP attempts to use the supervision information of must-link and cannot-link constraints to adjust the sparse reconstructiv...
Supervised Composite Kernel Locality Preserving Projection Feature Extraction for Hyperspectral Image Classification
Locality preserving projection (LPP) does not take advantage of the spatial correlation of pixels in the image; it treats the pixels as independent pieces of information. In this paper, a kernel-based manifold learning feature extraction method that considers the spatial relationship of neighboring pixels, called supervised composite kernel locality preserving projection (SCKLPP), is propo...
2D Dimensionality Reduction Methods without Loss
In this paper, several two-dimensional extensions of principal component analysis (PCA) and linear discriminant analysis (LDA) techniques have been applied in a lossless dimensionality reduction framework for face recognition. In this framework, the benefits of dimensionality reduction were used to improve the performance of its predictive model, which was a support vector machine (...
Journal title:
Volume, issue
Pages -
Publication date: 2015